What Is There Between Any Two Nodes in a Complex Network? 



Luciano da Fontoura Costa* and Francisco A. Rodrigues 
Instituto de Física de São Carlos, Universidade de São Paulo, Av. Trabalhador São Carlense 400,
Caixa Postal 369, CEP 13560-970, São Carlos, São Paulo, Brazil

This article focuses on the identification of the number of paths with different lengths between
pairs of nodes in complex networks and on how such information, by providing a comprehensive
description of the network topology, can be effectively used for the characterization of theoretical
and real-world complex networks, as well as for the identification of communities.

PACS numbers: 89.75.Hc, 79.60.Ht, 45.70.Vn, 89.75.Fb 



INTRODUCTION 

A large number of natural and artificial complex sys- 
tems can be represented and modeled in terms of net- 
works involving interacting components. Such interac- 
tions can range from signalling between cells to social 
contacts (e.g. [1]). Indeed, complex networks theory has 
been considered in a wide range of applications including 
neuronal connections, protein-protein interactions, econ- 
omy, Internet communication and social ties [2], to cite 
just a few possibilities. It was thanks to their flexibil- 
ity and potential for multidisciplinary applications that 
complex networks became so popular and important. 

Much of the effort by network researchers has been concentrated on developing tools for
characterization, classification, modeling and simulation. The characterization of network
structure is one of the fundamental steps of complex networks research, because the modeling,
simulation and classification of networks all depend strongly on accurate descriptions of the
respective topology [3, 4]. In order to quantify different topological properties, a large set
of network measurements has been developed [4]. Many of these features are related to the
concept of connectivity between nodes, taking into account the immediate links between each
pair of nodes. Several of the measurements currently employed in order to characterize network
structure, such as degree, clustering coefficient and shortest path length, are ultimately
related to pairwise interconnectivity [4]. It is therefore important to resort to longer range
interactions between nodes in order to achieve a more comprehensive description,
characterization and modeling of complex structures. The shortest path length (or geodesic
distance) between a pair of nodes is one such measurement [5]. Usually, its average value is
obtained by considering the shortest distance between every pair of nodes. Some works have
also considered distance matrices, containing the shortest path lengths, in order to enhance
the characterization [6-9] and the identification of isomorphisms [8, 9]. Nevertheless, the
isolated consideration of these measurements results in incomplete network characterization,
since important information about network structure is not taken into account. For instance,
the alternative paths between pairs of nodes whose lengths are larger than the shortest path
are completely overlooked by more traditional network analysis. Thus, two networks presenting
the same degree and shortest path distributions, but different organizations of alternative
paths, can be characterized as being identical, which is clearly inappropriate. Also,
alternative paths can provide additional information about network resilience, since they
generally reinforce connections, providing alternative routes and maximizing the flow. More
traditional robustness analyses, which take into account just the local connectivity and
measurements related to the shortest paths, such as betweenness centrality [10] and
efficiency [11], also do not take into account the richer interconnectivity structure provided
by longer alternative paths.

The comprehensive characterization of pairwise con- 
nectivity clearly requires more general approaches con- 
sidering multi-scale interactions extending from the im- 
mediate link to long-range connectivity. The term multi- 
scale refers to the varying topological scales which are 
progressively taken into account around the nodes. The 
traditional approach considers just the first scale (h = 1),
i.e. immediate neighbor connectivity. In other words, 
in addition to immediate-connection measurements and 
limited long range information such as the shortest paths, 
the identification of alternative paths of any length can 
enhance the network characterization, providing a more 
complete description of network topology. Measurements 
taking into account the successive shortest path lengths 
from a reference node (concentric neighborhoods) have 
been proposed in the literature in terms of hierarchical 
or concentric representations [4, 12-15]. 

The further generalization of the concepts of connectivity and interaction, which is
necessary in order to account for larger portions of the network, requires the identification
of alternative paths between pairs of nodes, as illustrated in Figure 1. Let us suppose we are
interested in the pairwise interconnection between London (UK) and Lyon (France), which is
particularly important for those people who have to travel by express train between those two
cities. If we consider the shortest path approach, just the path of length three between those
two cities is taken into account, while the other seven alternative paths are completely
overlooked. Nevertheless, such paths are still fundamental for network communication and
resilience. For instance, if the connection London-Paris-Lyon is blocked in any part other
than from London to Lille, the train or passengers can always take the alternative routes. The
importance of the identification of alternative paths in networks has also been substantiated
with respect to the cardiovascular system [6].

FIG. 1: The European high-speed rail network connecting the main cities of central Europe.
While the traditional network characterization in terms of the shortest distance takes into
account just one path of length three between London and Lyon, the other seven alternative
paths are overlooked. However, the alternative paths are fundamental for network topology and
can be associated with important dynamics such as traffic jamming and resilience [3].

In the current article we report a comprehensive approach to generalize the concept of
pairwise connectivity through the quantification of the distribution of paths of different
lengths between pairs of nodes. The potential of such a framework is illustrated with respect
to the characterization of both theoretical models and real-world networks, as well as
community detection. Helped by optimal multivariate statistical methods, we characterize the
relationships between the topologies of six distinct theoretical models of complex networks
and discuss the achieved discriminability. In order to illustrate the variation of the
generalized connectivity in real-world networks, we report and discuss results corresponding
to: (i) the US highway network, (ii) the neural C. elegans network [5], (iii) the cat cortical
network [16] and (iv) a food web of a broadleaf forest in New Zealand [17]. In addition, we
characterize the network modular structure (community) considering the respective generalized
connectivity matrices. The projection of the network vertices by an optimal multivariate
statistical method resulted in vertices belonging to the same communities being projected
nearby, forming clusters of points.

In the next sections, we provide the basic concepts related to network models, paths between
nodes, principal component analysis (PCA) and network discriminability. An optimal algorithm
to find the number of paths between pairs of vertices is also provided. The potential of the
proposed methodology with respect to theoretical and real-world networks, as well as for
community identification, is presented and discussed subsequently.

BASIC CONCEPTS AND METHODOLOGY 

An undirected network can be represented by its adjacency matrix A, whose elements $a_{ij}$
are equal to one whenever there is a connection between the vertices i and j, and equal to
zero otherwise. The number of connections of a given vertex i is called its degree $k_i$,
while its clustering coefficient $cc_i$ is defined as $cc_i = 2 n_i / [(k_i - 1) k_i]$, where
$n_i$ is the number of connections between the neighbors of i [5]. The number of paths with
length $h = 1, 2, \ldots, H$ between each pair of nodes can be expressed in terms of the
three-dimensional matrix $R = R(h,i,j)$ (see Figure 2), so that each matrix
$R_h(i,j) = R(h,i,j)$ gives the total number of paths of length h extending from node j to
node i (observe that $R_1 = A$). These matrices will always be symmetric for undirected
networks. The set of matrices $R_h$ therefore conveys comprehensive information about the
generalized connectivity between any pair of nodes, providing valuable additional information
about the network structure. In addition, the shortest path distance matrix D can be derived
from such matrices by taking, for each pair of nodes, the smallest h for which $R_h(i,j)$ is
nonzero (see Figure 2). Therefore, the matrices A and D are special cases of the set of
matrices $R_h$. The matrix T, which is obtained by summing the elements along the set $R_h$,
gives the total number of paths of lengths $h = 1, \ldots, H$ between every pair of vertices.
As such, this matrix quantifies all alternative paths between pairs of nodes and can be used,
for instance, in the analysis of network resilience.
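The definitions of $R_h$, D and T can be illustrated with a minimal sketch. It assumes that "paths" are self-avoiding (no vertex is repeated), which is what the "non-visited" condition of Algorithm 1 suggests. The function names `count_simple_paths` and `distance_and_total` and the dictionary-of-sets input format are hypothetical choices made for this illustration; the recursive depth-first enumeration is a simple stand-in, not the stack-based procedure of Algorithm 1.

```python
def count_simple_paths(adj, H):
    """Return R, where R[h][i][j] is the number of self-avoiding paths of
    length h (1 <= h <= H) between nodes i and j.

    adj: dict mapping node labels 0..N-1 to sets of neighbours (assumed format).
    """
    N = len(adj)
    R = [[[0] * N for _ in range(N)] for _ in range(H + 1)]  # R[0] is unused

    def extend(source, current, visited, h):
        # Try to lengthen the current path (of length h) by one edge.
        for nxt in adj[current]:
            if nxt in visited:
                continue
            R[h + 1][source][nxt] += 1
            if h + 1 < H:
                visited.add(nxt)
                extend(source, nxt, visited, h + 1)
                visited.remove(nxt)

    for s in adj:
        extend(s, s, {s}, 0)
    return R


def distance_and_total(R):
    """Derive D (smallest h with R[h][i][j] > 0) and T (sum over h) from R."""
    H, N = len(R) - 1, len(R[1])
    D = [[0 if i == j else None for j in range(N)] for i in range(N)]
    T = [[0] * N for _ in range(N)]
    for h in range(1, H + 1):
        for i in range(N):
            for j in range(N):
                T[i][j] += R[h][i][j]
                if D[i][j] is None and R[h][i][j] > 0:
                    D[i][j] = h
    return D, T


# A square 0-1-2-3-0: opposite corners are joined by two paths of length 2.
square = {0: {1, 3}, 1: {0, 2}, 2: {1, 3}, 3: {0, 2}}
R = count_simple_paths(square, H=3)
D, T = distance_and_total(R)
print(R[2][0][2], D[0][2], T[0][1])  # 2 2 2
```

Entries of D left as `None` mark pairs with no path of length up to H. The running time grows with the number of paths, which can be exponential in H for dense networks.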

An illustration of the several connectivity approaches that can be applied in order to
characterize the network in Figure 1 is provided in Figures 3 and 4. Figures 3 (a) and (b)
show the traditionally adopted adjacency and shortest path length matrices, respectively.
While the adjacency matrix A indicates the immediate connectivity between pairs of nodes, the
shortest path length matrix D contains the number of edges along the shortest paths between
each pair of nodes. On the other hand, the matrices in Figure 4 are rarely (if ever)
considered in the literature and express other types of pairwise interactions between the
nodes. In that figure, the matrices $R_2$, $R_3$, $R_4$ and $R_5$ express the number of
distinct paths of lengths h = 2, h = 3, h = 4 and h = 5, respectively, between each possible
pair of nodes in the network in Figure 1. Observe that these matrices make explicit important
information which cannot be easily inferred from either of the two previous matrices, A and D.
For instance, while matrix D indicates only that the shortest path between Strasbourg (France)
and Antwerp (Belgium) has length four, the matrix $R_5$ indicates that there are four paths of
length five and the matrix $R_4$ one path of length four. Similarly, while the matrix D
indicates only that the shortest path between Lyon (France) and Metz (France) has length two,
the matrix $R_2$ shows two paths of length two and the matrix $R_3$ one path of length three,
all viable alternatives in the case of an eventual disruption of the shortest path. The other
matrices provide information about even longer alternative paths, of eventual interest for a
tourist who wants to visit several nearby places. Therefore, the set of matrices R can provide
valuable additional information about the network structure, leading to more accurate network
characterization and classification.

FIG. 2: Networks can be characterized in terms of the three-dimensional matrix
$R = R(h,i,j)$, which provides a more comprehensive description of the network structure than
the traditional adjacency ($A(i,j) = R_1(i,j)$) and shortest path length
($D(i,j) = \min\{h : R(h,i,j) > 0\}$) matrices. The matrix T, which provides the total number
of paths between every pair of vertices i and j, can be obtained by summing the elements of
the matrices $R_h(i,j)$.



FIG. 3: The adjacency (a) and distance (b) matrices, respective to the network in Figure 1.

FIG. 4: The matrices containing the number of paths of length h = 2 (a), h = 3 (b), h = 4 (c)
and h = 5 (d) between each pair of nodes in the network in Figure 1.

Algorithm for identification of number of paths

Algorithm 1 allows the identification of all paths between a reference vertex i and all other
nodes in a network. Such an optimal algorithm (each node is visited only once) can be applied
to directed and undirected networks. The operations push(a) and pop(a) place the data a onto a
stack and remove it from the stack, respectively. Though this deterministic algorithm is
optimal, it may require long periods of time depending on the type of network, its size,
average degree, as well as the total number of steps H required. Stochastic algorithms such as
those described in [18, 19] can be considered for estimations in such cases. The execution of
such an algorithm from all vertices of the network yields the set of matrices $R_h$.

Algorithm 1 The general algorithm to obtain the number of paths between each pair of vertices.

for each vertex i do
    next = one of the non-visited immediate neighbors of i;
    stack.push(remainder of non-visited immediate neighbors of i, h);
    path.push(next);
    R(next, i, h) = 1;
    while stack not empty or size(path) > 0 do
        curr = next;
        ng = number of non-visited immediate neighbors of curr;
        if ng > 0 then
            next = one of the non-visited immediate neighbors of curr;
            stack.push(remainder of non-visited immediate neighbors of curr, h);
            path.push(next);
        else
            next, h = stack.pop(one node, h);
            node = -1;
            while node ≠ next do
                node = path.pop();
            end while
        end if
        R(next, i, h) = R(next, i, h) + 1;
    end while
end for

Decorrelation of Measurements and Dimensionality Reduction

Because of the relatively high dimensionality of the path measurements, especially as a
consequence of their parameterization with h, as well as the already observed correlations
along h, it becomes important to consider means for obtaining effective projections of the
measurements (dimensionality reduction) so as to visualize the network and vertex separations.
This can be optimally performed through the method known as principal component analysis
(PCA).

PCA can be defined as the orthogonal projection of the original data onto a lower dimensional
linear space, called the principal subspace, such that the variance of the projected data is
maximized along its first axes [20]. Indeed, PCA can be understood as a rotation of the axes
of the original variable coordinate system to new orthogonal axes, in order to make the new
axes coincide with the directions of maximum variation of the original variables [21]. In
practice, a PCA consists initially of finding the eigenvalues and eigenvectors of the sample
covariance matrix [22]. So, let each of Q observations (e.g. a node, a pair of nodes, or a
network), henceforth represented as $v = 1, 2, \ldots, Q$, be characterized in terms of M
respective features or measurements, represented in terms of the feature vector $\vec{f}_v$
(each element $f_v(i)$, $i \in \{1, 2, \ldots, M\}$, of this vector corresponds to one
measurement of the observation v). For instance, we can consider the number of paths between
each vertex i and all other vertices in the network. In this case, each vertex presents a
feature vector with N elements. In cases where the number of features is large, it is possible
to optimally reduce their dimensionality M by removing the correlations between them. This
important dimensionality reduction transformation can be easily implemented by using the PCA
methodology (e.g. [4, 21]).

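Let the covariance between each pair of measurements i and j be given as

$$C(i,j) = \frac{1}{Q-1} \sum_{v=1}^{Q} \left( f_v(i) - \mu_i \right) \left( f_v(j) - \mu_j \right), \qquad (1)$$

where $\mu_i$ is the average of $f_v(i)$ over the Q observations, i.e.

$$\mu_i = \frac{1}{Q} \sum_{v=1}^{Q} f_v(i). \qquad (2)$$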

The covariance matrix between these measurements is defined as $C = [C(i,j)]$, with dimension
$M \times M$. Let the eigenvalues of C, sorted in decreasing order, be represented as
$\lambda_l$, $l = 1, 2, \ldots, M$, with respective eigenvectors $\vec{v}_l$. By stacking such
eigenvectors as rows, it is possible to obtain the matrix

$$G = \begin{bmatrix} \vec{v}_1^{\,T} \\ \vec{v}_2^{\,T} \\ \vdots \\ \vec{v}_M^{\,T} \end{bmatrix}, \qquad (3)$$

which defines the stochastic linear transformation known as the Karhunen-Loève transform
[4, 21]. Now, the new feature vectors $\vec{g}$ can be obtained from the original measurement
vectors $\vec{f}$ by making

$$\vec{g} = G \vec{f}. \qquad (4)$$

The variances of the new measurements in g are provided by the respective eigenvalues. In case
the measurements are correlated, most of their variances will be concentrated along the first
elements of g, which is guaranteed by the fact that the PCA completely decorrelates the
original measurements. Indeed, the PCA is optimal with respect to concentrating the variation
along the first axes. Therefore, it is possible to reduce the dimensionality of the feature
vectors by disregarding in the matrix in Equation 3 all eigenvectors associated with
eigenvalues smaller than a given threshold, or by taking only the first R eigenvectors. The
resulting variables, which are fully uncorrelated linear combinations of the original
measurements, concentrate the variance of the overall data and therefore represent a
particularly meaningful characterization of the distribution of the original observations.
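The PCA steps described above (covariance matrix, sorted eigen-decomposition, projection onto the first eigenvectors) can be sketched with NumPy as follows. This is an illustrative sketch under stated assumptions rather than the authors' implementation: the function name `pca_project` is hypothetical, the covariance is assumed to use the sample normalization 1/(Q-1), and the data are projected as in Equation 4, without subtracting the mean, which shifts the projected points but does not change their variances.

```python
import numpy as np


def pca_project(F, n_components=3):
    """Project Q observations (rows of F) with M measurements each
    onto the first n_components principal axes."""
    F = np.asarray(F, dtype=float)
    Q = F.shape[0]
    centered = F - F.mean(axis=0)
    C = centered.T @ centered / (Q - 1)      # M x M sample covariance matrix
    eigvals, eigvecs = np.linalg.eigh(C)     # eigh: C is symmetric
    order = np.argsort(eigvals)[::-1]        # sort eigenvalues in decreasing order
    eigvals = eigvals[order]
    G = eigvecs[:, order].T                  # rows of G are the eigenvectors
    projected = F @ G[:n_components].T       # g = G f for every observation
    return projected, eigvals


rng = np.random.default_rng(0)
F = rng.normal(size=(40, 5))
F[:, 1] = 2.0 * F[:, 0] + 0.1 * F[:, 1]      # make two measurements correlated
projected, eigvals = pca_project(F, n_components=2)
print(projected.shape)                        # (40, 2)
```

The variance of each projected coordinate equals the corresponding eigenvalue, and the projected coordinates are mutually uncorrelated, which is the decorrelation property referred to in the text.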

CHARACTERIZATION OF THEORETICAL 
NETWORK MODELS 

Six different types of theoretical network models are considered in this article. The
Erdős-Rényi (ER) random graphs [23] are obtained by connecting N initially isolated nodes with
constant probability p. The traditional preferential attachment rule [24] is used to obtain
the scale-free Barabási-Albert (BA) networks. Such a model is a particular case of the complex
network model of Krapivsky et al. [25], which considers a non-linear preferential attachment
rule to establish connections during network growth: the probability of connection is defined
as $P_{i \to j} = k_j^{\alpha} / \sum_u k_u^{\alpha}$, where $\alpha$ is the non-linear
exponent. Observe that $\alpha = 1$ yields the BA model. In order to obtain the Watts-Strogatz
small-world model (WS), each connection in a linear lattice is rewired with probability p.
Geographical networks (GN) are obtained by starting with N nodes distributed uniformly along a
three-dimensional space and connecting them according to distance, i.e. the probability of
connecting two vertices i and j is given by $P_{ij} = \lambda \exp(-\lambda d_{ij})$, where
$\lambda$ is a parameter to adjust the network degree and $d_{ij}$ is the Euclidean distance
between i and j. Such a model was introduced by Waxman to model the Internet topology [26].
Knitted networks (KT) [18] can be obtained by generating random sequences of nodes and
connecting them sequentially (without repetition). The number of generated sequences depends
on the network average connectivity. This network is particularly regular with respect to
several of its topological and dynamical properties [18, 19]. In the current work, all these
networks are grown with parameter sets so as to have the same number N of nodes and
approximately the same average degree.
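As an illustration of the non-linear preferential attachment rule, the sketch below grows a network in which each new node connects to existing nodes with probability proportional to $k_j^{\alpha}$. The function name `nonlinear_pa_network`, the fully connected initial core of m + 1 nodes and the parameter m (links added per new node) are assumptions made for this example; the source does not specify these details.

```python
import random


def nonlinear_pa_network(n_nodes, m, alpha, seed=None):
    """Grow an undirected network by non-linear preferential attachment.

    Each new node attaches to m distinct existing nodes, chosen with
    probability proportional to degree ** alpha (alpha = 1 gives the
    linear, BA-like rule). Returns a dict: node -> set of neighbours.
    """
    rng = random.Random(seed)
    # Assumed initial condition: a fully connected core of m + 1 nodes.
    adj = {i: set(range(m + 1)) - {i} for i in range(m + 1)}
    for new in range(m + 1, n_nodes):
        candidates = list(adj)                        # existing nodes only
        weights = [len(adj[j]) ** alpha for j in candidates]
        targets = set()
        while len(targets) < m:                       # redraw until m distinct targets
            targets.add(rng.choices(candidates, weights=weights, k=1)[0])
        adj[new] = set()
        for t in targets:
            adj[new].add(t)
            adj[t].add(new)
    return adj


network = nonlinear_pa_network(n_nodes=50, m=2, alpha=1.0, seed=1)
print(len(network))  # 50
```

Because every new node contributes exactly m edges, networks grown with different values of alpha keep the same average degree, which matches the requirement that the compared models have approximately equal average degree.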

In order to visualize the network distribution and separation (discriminability), the set of
networks can be projected into a 3D space of decorrelated measurements. In the current work,
we take into account as original measurements the averages and standard deviations of each
matrix $R_h$. In this case, if we consider a maximum of H distances, we have a set of 2H
measurements, and each network is represented by a feature vector
$\vec{v} = \{\mu_1, \sigma_1, \mu_2, \sigma_2, \ldots, \mu_H, \sigma_H\}$, where $\mu_h$ and
$\sigma_h$ are the average and standard deviation of the elements in the matrix $R_h$,
respectively. The network projections obtained by the PCA reflect the network similarities in
terms of their respective feature vectors. Indeed, models that are mapped nearby in the
projected space tend to present similar topologies.
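The construction of the 2H-element feature vector can be sketched as below. The function name `path_feature_vector` is hypothetical, and two details are assumptions not fixed by the text: all matrix elements (including the diagonal) enter the statistics, and the population standard deviation is used.

```python
from statistics import mean, pstdev


def path_feature_vector(R_matrices):
    """Build [mu_1, sigma_1, ..., mu_H, sigma_H] from the matrices R_1..R_H,
    each given as a list of rows."""
    features = []
    for Rh in R_matrices:
        values = [x for row in Rh for x in row]   # flatten the matrix
        features.extend([mean(values), pstdev(values)])
    return features


# Two nodes joined by one edge: R_1 is the adjacency matrix, R_2 is empty.
R1 = [[0, 1], [1, 0]]
R2 = [[0, 0], [0, 0]]
print(path_feature_vector([R1, R2]))  # [0.5, 0.5, 0, 0.0]
```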

COMMUNITY DETECTION 

Vertices belonging to the same community tend to present similar patterns of generalized
connectivity, i.e. distributions of the number of paths of varying lengths. Since the
generalized distance matrices provide comprehensive information about the distribution of
paths between nodes, they can be considered for community detection. Thus, each vertex of a
given network is represented by the feature vector $\vec{x}_i$ corresponding to the respective
row (or column, as the distance matrices are symmetric) i in the matrix $R_h$. Therefore, each
element j of such a vector represents the number of paths of length h between i and j. The
visualization of the vertex distribution can then be obtained by using PCA to project the
feature vectors into the three-dimensional space. Thus, the vertices presenting similar sets
of attributes tend to be projected nearby, giving rise to clusters of points. Each of these
clusters indicates a possible community in the original network.
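A minimal sketch of this projection is given below, using the rows of a matrix $R_h$ as vertex feature vectors. The function name `project_vertices` is hypothetical, the standardization uses the population standard deviation, and the principal axes are obtained from a singular value decomposition of the standardized data, which is an equivalent way of computing the PCA. The toy matrix is $R_2$ for two disconnected triangles, an extreme case chosen only because its structure is easy to check.

```python
import numpy as np


def project_vertices(Rh, n_components=3):
    """Standardize the rows of R_h (one row per vertex) and project them
    onto the first n_components principal axes."""
    X = np.asarray(Rh, dtype=float)
    std = X.std(axis=0)
    std[std == 0] = 1.0                       # avoid dividing by zero for constant columns
    Z = (X - X.mean(axis=0)) / std            # standardized feature vectors
    _, _, Vt = np.linalg.svd(Z, full_matrices=False)
    return Z @ Vt[:n_components].T            # rows of Vt are the principal axes


# R_2 of two disconnected triangles {0, 1, 2} and {3, 4, 5}: one path of
# length two between any two distinct vertices of the same triangle.
R2 = [[0, 1, 1, 0, 0, 0],
      [1, 0, 1, 0, 0, 0],
      [1, 1, 0, 0, 0, 0],
      [0, 0, 0, 0, 1, 1],
      [0, 0, 0, 1, 0, 1],
      [0, 0, 0, 1, 1, 0]]
points = project_vertices(R2, n_components=3)
print(np.sign(points[:, 0]))  # one sign for vertices 0-2, the opposite for 3-5
```

In this toy case the first principal axis alone separates the two groups; for real networks the clusters would be sought in the full three-dimensional projection.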

RESULTS AND DISCUSSION 

Our first experimental investigation concentrates on the characterization and discrimination
between the topologies of six different theoretical models of complex networks, namely: (i)
the random graphs of Erdős and Rényi (ER), (ii) the small-world network model of Watts and
Strogatz (WS), (iii) the geographical model of Waxman (GN), (iv) the scale-free model of
Barabási and Albert (BA), (v) the non-linear preferential attachment model of Krapivsky et al.
(NL) and (vi) the knitted network model of Costa (KT). We computed the averages and standard
deviations of the matrices $R_h$ for $h = 1, \ldots, 6$, for each network model realization.
In this way, each generated network is represented in terms of a vector with 12 elements, i.e.
the network n is represented by the respective vector
$\vec{v}_n = \{\mu_1, \sigma_1, \mu_2, \sigma_2, \ldots, \mu_6, \sigma_6\}$, where $\mu_h$ and
$\sigma_h$ stand for the average and standard deviation of the values in the matrix $R_h$. We
generated 25 network realizations for each model and, after standardization [32], we projected
those networks into the 3D space by applying the PCA methodology. As we can see in Figure 5,
each of the types of networks generated by these models is represented by a separate cluster
of points (sharing similar topological properties), which indicates a clear separation between
the theoretical network models. While the networks generated by preferential attachment rules
(BA and NL, orange and cyan points) are organized at the right side of the projection, the
most regular models (KT and ER, gray and blue points) are placed at the left side. Indeed, the
scale-free networks tend to present greater variability of the number of paths than the more
homogeneous models, since the presence of hubs tends to increase the number of paths between
pairs of vertices and therefore generates a highly inhomogeneous path length distribution. In
addition, the network models that generate networks with more regular structure tend to
present the smallest cloud dispersions (KT and WS, gray and green points). In this way, by
providing accurate discriminability between different models, the generalized connectivity
approach presents good potential for enhancing network characterization and classification.

FIG. 5: The projection of the networks generated by the ER (blue), WS (green), BA (orange),
GN (magenta), KT (gray), and NL (cyan) network models in the 3-dimensional space.

In the case of the real-world networks, we applied our analysis to: (i) the US highway
network, (ii) the neural C. elegans network [5], (iii) the cat cortical network [16] and (iv)
a food web of a broadleaf forest in New Zealand [17]. Details about these networks are given
in Table I. Since these networks present different numbers of vertices and connections, we
cannot compare them directly. Note that the number of paths for the cortical network is higher
than for the other networks, which is a direct consequence of its higher average node degree.
We therefore considered the z-score in order to characterize the distribution of paths, which
is calculated by [27]

$$Z_h = \frac{\langle R_h \rangle - \mu_{\mathrm{random}}}{\sigma_{\mathrm{random}}}, \qquad (5)$$

where $\langle R_h \rangle$ is the average number of paths of length h in the real network,
and $\mu_{\mathrm{random}}$ and $\sigma_{\mathrm{random}}$ are the average and standard
deviation of the number of paths in the respective randomized network ensemble, which was
generated by the configuration model and presents the same degree distribution as the
respective real-world network [28]. The obtained results for the four networks are presented
in Table I. It is interesting to note that just the neural network of the nematode C. elegans,
which is the only case of a nervous system completely mapped at the level of neurons and
chemical synapses [5], presents a larger number of paths of lengths h = 2, 3 and 4 than the
randomized counterparts. For h > 4, the randomized versions present a higher number of paths.
This suggests that connections of length 2, 3 and 4 could be more important for allowing
proper dynamics in the C. elegans network. The highest difference for h = 3 suggests that the
evolution of the neuronal organization in this species tended to favor the alternative
connections of length 3, while avoiding longer range connections. On the other hand, in the
case of the food web, the cortical network and the US highway network, the z-scores tended to
decrease with h, which indicates that such networks tend to present a smaller number of paths
of length h > 2 than their randomized versions. Particularly, since food webs tend to present
a small number of trophic levels, there are no paths of length h > 4, while the randomized
versions can display longer paths. Indeed, the small network diameter is a direct consequence
of the energy transmission between trophic levels [29]. In the case of the highway network,
the fact that the randomized versions tended to present a larger number of paths than the
respective real-world version is a direct consequence of the fact that the connections in a
geographical highway network tend to be constrained by the adjacency between neighboring
localities.
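The z-score of Equation 5 can be sketched as follows, given the average number of paths of length h in the real network and the corresponding averages measured on an ensemble of randomized networks. The function name `path_z_score` is hypothetical, the use of the sample standard deviation over the ensemble is an assumption, and the numbers in the usage line are made up for illustration and are not taken from Table I.

```python
from statistics import mean, stdev


def path_z_score(real_average, random_averages):
    """Z-score of the average number of paths in the real network with
    respect to an ensemble of randomized networks."""
    mu_random = mean(random_averages)
    sigma_random = stdev(random_averages)     # sample standard deviation (assumed)
    return (real_average - mu_random) / sigma_random


# Illustrative values only: three randomized networks and one real network.
print(path_z_score(5.0, [1.0, 2.0, 3.0]))  # 3.0
```

A positive value indicates more paths of that length than expected from the degree-preserving randomization, and a negative value indicates fewer.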

Our final analysis concentrated on the relationship between the modular network organization
and the distribution of the number of alternative paths. Since vertices in the same community
tend to present similar sets of more strongly connected nodes, the number of paths between the
vertices in the same module tends to be large. We applied the methodology described in the
Community Detection section to the Zachary karate club network and to an artificial modular
network, which have been widely used as tests for community structure algorithms (e.g. [30]).
The karate club network was constructed with the data collected by observing 34 members of a
karate club over a period of 2 years and considering friendship between members [31]. The
artificial network, on the other hand, was generated as described in [30], where a set of N
vertices is divided into c communities. Then, each vertex is connected to $z_{in}$ vertices in
the same community and to $z_{out}$ vertices in the other communities. The connections between
communities are distributed uniformly. In the current work, we considered N = 128, c = 4,
$z_{in} = 10$ and $z_{out} = 6$. From these networks, we calculated the respective $R_h$
matrices for h = 1, 2 and 3. After standardization of the feature vectors, we applied the PCA
and obtained the projections presented in Figures 6 and 7 for the karate club and the
artificial modular networks, respectively. In the case of the Zachary karate club network, the
best identification of the communities was obtained for h = 2, where the classification of the
vertices into the two clusters corresponds precisely to the actual division of the club
members. The case h = 1, which considers the traditional adjacency matrix, does not provide an
accurate separation of the communities into different clusters. For h > 3, the separation is
worse than for h = 2 because the network presents a very small average shortest distance
($\ell = 2.3$). Considering the shortest path matrix, the discriminability was also worse than
that obtained for $R_2$. In the case of the artificial modular network, the best separation
between the communities was also obtained for h = 2, although for h = 3 and h = 4 the
separation into four respective clusters is still clear. For the traditional matrices A and D,
two communities were joined into the same cluster, therefore completely undermining the
separation. It is interesting to note that most community finding algorithms cannot determine
the communities perfectly for $z_{out} = 6$ [3]. Therefore, the consideration of the
alternative paths can provide more information about the network structure and organization.

TABLE I: The z-scores and the average number of paths obtained for the real-world networks.

Network        N    ⟨k⟩   Z2      ⟨R2⟩  Z3      ⟨R3⟩  Z4     ⟨R4⟩  Z5     ⟨R5⟩   Z6     ⟨R6⟩
Food web       78   3.1   0.002   0.05  -0.087  0.03  -0.12        -0.13         -0.11
Cortical net.  53   15.5  -0.027  5     -0.043  85    -0.06  1400  -0.08  21800  -0.11  331000
Neural net.    297  7.9   0.026   0.30  0.045   3     0.01   25    -0.05  210    -0.10  1750
US Highway     284  6.0           0.02  -0.048  2     -0.06  13    -0.06  100    -0.06  680

FIG. 6: The original separation between the two classes of karate club members (a), and the
projection into the three-dimensional space of the generalized matrix $R_2$ (b).

FIG. 7: (a) The artificial network containing four communities and (b) the projection of the
respective matrix $R_2$ into the three-dimensional space considering the PCA methodology.



CONCLUDING REMARKS 

The concept of connectivity underlies a great part of complex networks research. However,
connectivity has typically been understood and quantified in terms either of strictly local
measurements, such as the local degree, or by considering shortest path lengths. Though more
global, the latter feature fails to take into account alternative pathways between pairs of
nodes, which are extremely important in influencing the topological properties of the
networks. For instance, the presence of more than one path between two nodes tends to increase
the interaction between them and consequently raises their communication robustness under edge
disruption.

In the current paper, we analyzed the generalized network connectivity with respect to the
characterization of six network models and four real-world networks, as well as for community
finding. We showed that the consideration of the alternative paths between vertices tends to
provide an accurate network topology discriminability, as observed for the networks generated
by the different models. The analysis of real-world networks suggests that the long range
connectivity tends to be limited in those networks and may be strongly related to network
evolution and organization. In addition, we studied how the distribution of the number of
paths is related to network modular structure. The obtained results indicate that the proposed
approach is particularly promising for community identification. Indeed, a possibility for
future work would be the improvement of the community analysis by considering clustering
methods to separate the cloud of points obtained in the projection, such as k-means or
agglomerative hierarchical clustering [21]. In addition, pattern recognition approaches can be
considered in order to quantify the separation between several types of network models and
therefore provide complex network taxonomies. In this case, real-world networks can be
associated with the most likely theoretical model, as described in [4]. Studies relating the
number of paths with network dynamics constitute another promising research possibility.